WO 2005/050620



PCT/IB2004/052334 




MATCHING DATA OBJECTS BY MATCHING DERIVED FINGERPRINTS 



FIELD OF THE INVENTION 

The invention relates to a method and apparatus for matching fingerprints. 

BACKGROUND OF THE INVENTION 
Fingerprinting technology is used to identify media content (such as audio or
video). An audio or video segment is identified by extracting a fingerprint from it, and
searching for the extracted fingerprint in a database in which fingerprints of known content are
stored. Content is identified if the similarity between the extracted fingerprint and the stored
fingerprint is deemed sufficient.

The prime objective of multimedia fingerprinting is an efficient mechanism to
establish the perceptual equality of two multimedia objects: not by comparing the (typically
large) objects themselves, but by comparing the associated fingerprints (small by design). In
most systems using fingerprinting technology, the fingerprints of a large number of
multimedia objects along with their associated metadata (e.g. in the case of song information,
name of artist, title and album) are stored in a database. The fingerprints serve as an index to
the metadata. The metadata of unidentified multimedia content are then retrieved by
computing a fingerprint and using this as a query in the fingerprint/metadata database. The
advantage of using fingerprints instead of the multimedia content itself is three-fold: reduced
memory/storage requirements as fingerprints are relatively small; efficient comparison as
perceptual irrelevancies have already been removed from fingerprints; and efficient searching
as the data set to be searched is smaller.

A fingerprint can be regarded as a short summary of an object. Therefore, a
fingerprint function should map an object X consisting of a large number of bits to a
fingerprint F of only a limited number of bits. There are five main parameters of a fingerprint
system: robustness; reliability; fingerprint size; granularity; and search speed (or scalability).

The degree of robustness of a system determines whether a particular object 
can be correctly identified from a fingerprint in cases where signal degradation is present. In 
order to achieve high robustness the fingerprint F should be based on perceptual features 
which are invariant (at least to a certain degree) with respect to signal degradations. 
Preferably, a severely degraded signal will still yield a similar fingerprint to a fingerprint of
an original undegraded signal. The "false rejection rate" (FRR) is generally used to express
the measure of robustness of the fingerprinting system. A false rejection occurs when the
fingerprints of perceptually similar objects are too different to lead to a positive
identification.

The reliability of a fingerprinting system refers to how often an object is
identified falsely. In other words, reliability relates to a "false acceptance rate" (FAR), i.e.
the probability that two different objects may be falsely declared to be the same.

Obviously, fingerprint size is important to any fingerprinting system. In 
general, the smaller the fingerprint size, the more fingerprints can be stored in a database.
Fingerprint size is often expressed in bits per second and determines to a large degree the 
memory resources that are needed for a fingerprint database server. 

Granularity is a parameter that can depend on the application and relates to
how long (or large) a sample of an object must be in order to identify it.

Search speed (or scalability), as it sounds, refers to the time needed in order
to find a fingerprint in a fingerprint database.

The above five basic parameters have a large impact on each other. For
instance, to achieve a lower granularity, one needs to extract a larger fingerprint to obtain the
same reliability. This is due to the fact that the false acceptance rate is inversely related to
fingerprint size. Another example: search speed will generally increase when one designs a
more robust fingerprint.

Having discussed the basic parameters of a fingerprinting system, a general 
description of a typical fingerprinting system is now made. 

A fingerprint may be based on extracting a feature-vector from an originating
audio or video signal. Such vectors are stored in a database with reference to the relevant
metadata (e.g. title, author, etc.). Upon reception of an unknown signal, a feature-vector is
extracted from the unknown signal, which is subsequently used as a query on the fingerprint
database. If the distance between the query feature-vector and its best match in the database
is below a given threshold, then the two items are declared equal and the associated metadata
are returned: i.e. the received content has been identified.
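
The lookup just described can be sketched as follows. This is a minimal illustration rather than the method of any particular system: it assumes binary fingerprints held as equal-length byte strings, uses the Hamming distance as the distance measure, and scans the database linearly. The names `hamming_distance` and `lookup`, and the in-memory list standing in for the database, are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def lookup(query: bytes,
           database: Iterable[Tuple[bytes, str]],
           threshold: int) -> Optional[str]:
    """Return the metadata of the best match, or None if it is not close enough.

    `database` is an assumed stand-in for the fingerprint/metadata store:
    an iterable of (fingerprint, metadata) pairs.
    """
    best_distance: Optional[int] = None
    best_metadata: Optional[str] = None
    for fingerprint, metadata in database:
        d = hamming_distance(query, fingerprint)
        if best_distance is None or d < best_distance:
            best_distance, best_metadata = d, metadata
    # The items are declared equal only if the best distance is below the threshold.
    if best_distance is not None and best_distance < threshold:
        return best_metadata
    return None
```

A real system would replace the linear scan with an indexed search; the sketch only fixes the decision rule (best match first, then the threshold test).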

The threshold that is used in the matching process is a trade-off between the
false acceptance rate (FAR) and the false rejection rate (FRR). For instance, increasing the
threshold (i.e. increasing the acceptable "distance" between two fingerprints for those
fingerprints to still be judged similar) increases the FAR, but at the same time it reduces the
FRR. The trade-off between FAR and FRR is usually made via the so-called Neyman-Pearson
approach. This means that the threshold is selected to have the smallest value which keeps
the FAR below a pre-specified, allowable level. The FRR is not used for determining the
threshold, but merely results from the selected threshold value.

US 2002/0178410 A1 (Haitsma, Kalker, Baggen and Oostveen) discloses a
method and apparatus for generating and matching fingerprints of multimedia content. In this
document, it is described on page 4 thereof how two 3 second audio clips are declared similar
if the Hamming distance between two derived fingerprint blocks H1 and H2 is less than a
certain threshold value T.

In order to analyse the choice of the threshold T, the authors of US
2002/0178410 assume that the fingerprint extraction process yields random i.i.d.
(independent and identically distributed) bits. The number of bit errors will then have a
binomial distribution with parameters (n, p), where n equals the number of bits extracted and
p (=0.5) is the probability that a 0 or 1 bit is extracted. Since n is large, the binomial
distribution can be approximated by a normal distribution with a mean $\mu = np$ and a standard
deviation $\sigma = \sqrt{np(1-p)}$. Given a fingerprint block H1, the probability that a randomly
selected fingerprint block H2 has fewer than $\alpha n$ errors with respect to H1 is then given by:

$$\mathrm{FAR} = \frac{1}{\sqrt{2\pi}} \int_{(1-2\alpha)\sqrt{n}}^{\infty} e^{-x^{2}/2}\,dx = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{1-2\alpha}{\sqrt{2}}\sqrt{n}\right) \qquad (1)$$

However, in practice robust fingerprints have high correlation along the time
axis. This may be due to the large time correlation of the underlying video sequence, or the
overlap of audio frames. Experiments for audio fingerprints show that the number of
erroneous bits is normally distributed, but that the standard deviation is approximately 3
times larger than in the i.i.d. case. Equation (1) is therefore modified to include this factor 3:

$$\mathrm{FAR} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{1-2\alpha}{3\sqrt{2}}\sqrt{n}\right) \qquad (2)$$
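
As a numerical illustration of equations (1) and (2), the sketch below evaluates the Gaussian approximation of the FAR and inverts it to find the relative threshold that yields a desired FAR. It assumes the reading FAR = 1/2 erfc((1 - 2α)√n / (c√2)), where c = 1 gives the i.i.d. case of equation (1) and c = 3 the empirically widened case of equation (2). The function names and the parameter `std_factor` (standing for c) are hypothetical, and the inversion relies on the identity 1/2 erfc(x/√2) = Φ(-x) for the standard normal distribution function Φ.

```python
import math
from statistics import NormalDist


def far_gaussian(alpha: float, n: int, std_factor: float = 1.0) -> float:
    """FAR for a relative threshold alpha (fraction of the n bits allowed to differ).

    std_factor = 1 corresponds to the i.i.d. case, std_factor = 3 to the
    case in which the standard deviation is three times larger.
    """
    argument = (1.0 - 2.0 * alpha) * math.sqrt(n) / (std_factor * math.sqrt(2.0))
    return 0.5 * math.erfc(argument)


def alpha_for_far(target_far: float, n: int, std_factor: float = 1.0) -> float:
    """Relative threshold alpha at which far_gaussian equals target_far."""
    # 1/2 erfc(x / sqrt(2)) = Phi(-x), hence x = -Phi^{-1}(FAR).
    x = -NormalDist().inv_cdf(target_far)
    return 0.5 * (1.0 - std_factor * x / math.sqrt(n))
```

For a fixed target FAR, a larger `std_factor` gives a smaller `alpha`: the more the bit errors spread, the stricter the threshold has to be.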

The above approach assumes that the distribution between the fingerprints is
stationary. Although this seems to be a reasonable assumption for certain technologies, this is
definitely not the case for video fingerprinting. In video fingerprinting, the amount of
"activity" in the video is directly reflected in the correlation of the fingerprint bits: prolonged
stills lead to constant (i.e., very highly correlated) fingerprints, whereas a "flashy" music clip
will lead to a very low correlation between the fingerprint bits. This non-stationarity leads to
problems in determining an appropriate value for the threshold.

OBJECT AND SUMMARY OF THE INVENTION

It is an aim of embodiments of the present invention to propose an
arrangement for providing an adaptive thresholding technique.

According to a first aspect of the invention, there is provided a method of
comparing a query fingerprint to a candidate fingerprint, the method being characterised by
comprising: determining a statistical model of the query fingerprint and/or a candidate
fingerprint; and on the basis of the statistical model, deriving a threshold distance within
which the query fingerprint and the candidate fingerprint may be declared similar.

A second aspect of the invention provides a method of matching a query
object to a known object, wherein a plurality of candidate fingerprints representing a plurality
of candidate objects are pre-stored in a database, the method comprising receiving an
information signal forming part of the query object and constructing a query fingerprint
therefrom and comparing the query fingerprint to a candidate fingerprint in the database, the
method being characterised in that it further comprises the steps of: determining a statistical
model for the query fingerprint and/or the candidate fingerprint; and on the basis of the
statistical model, deriving a threshold distance within which the query fingerprint and the
candidate fingerprint may be declared similar.

In the methods of the first and second aspects, the derivation of a threshold
based upon a statistical model of the particular fingerprint provides adaptive threshold setting
which may optimise the FAR according to the type and internal characteristics of the query
fingerprint, giving improved matching qualities over the application of an arbitrary thresholding system.

Preferably, if a candidate fingerprint is found to be separated from the query
fingerprint by a distance less than the threshold distance, and the distance between the
candidate and the query fingerprint is less than the distance between any other candidate
fingerprint and the query fingerprint, then the candidate fingerprint is declared the best
matching candidate fingerprint and the candidate object represented by the best matching
candidate fingerprint and the query object represented by the query fingerprint are deemed to
be the same.

Preferably, the statistical model comprises the result of performing an internal
correlation on the query fingerprint and/or the candidate fingerprint.

Preferably, the fingerprints comprise binary values and the statistical model is
computed for the query fingerprint by determining a transition probability q for the query
fingerprint: the number of bits of a query fingerprint frame F(m,k) that differ
from their corresponding bit in the preceding fingerprint frame F(m,k-1) is counted, and this
number of transitions is divided by a maximum value M*(K-1), which would be obtained if all
fingerprint bits were of an opposite state to their corresponding preceding bit, where each
fingerprint comprises M bits per frame and spans K frames, in which k is the frame index
(ranging from 0 to K-1) and m is the bit-index within a frame (ranging from 0 to M-1).

The threshold distance T may then be computed from the following equation
based on a desired False Acceptance Rate (FAR):

$$\mathrm{FAR} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{1-2T}{\sqrt{2}}\,\sqrt{n}\,\sqrt{\frac{1-(1-2q)^{2}}{1+(1-2q)^{2}}}\right) \qquad (4)$$

where n = MK is the number of fingerprint bits and the threshold T is expressed as a fraction of n.

In a third aspect, the invention provides apparatus for matching a query object
to a known object, the apparatus comprising a fingerprint extraction module for receiving an
information signal forming part of a query object and constructing a query fingerprint
therefrom and a fingerprint matching module for comparing the query fingerprint to
one or more candidate fingerprints stored in a database, the
apparatus being characterised in that it further comprises: a statistical module for determining
a statistical model of the query fingerprint and/or one or more of the one or more candidate
fingerprints; a threshold determiner for deriving, on the basis of the statistical model, a threshold
distance T within which the query fingerprint and a candidate fingerprint may be declared
similar; and an identification module arranged such that if a candidate fingerprint is found to
be separated from the query fingerprint by a distance less than the threshold distance T, and
the distance between the candidate and the query fingerprint is less than the distance between
any other candidate fingerprint and the query fingerprint, then the candidate fingerprint is
declared the best matching candidate fingerprint and the candidate object represented by the
best matching candidate fingerprint and the query object represented by the query fingerprint
are deemed to be the same.

BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the invention, and to show how embodiments of
the same may be carried into effect, reference will now be made, by way of example, to the
accompanying diagrammatic drawings in which:

Figure 1 shows a functional block diagram illustrating a fingerprinting method
with an adaptive threshold in accordance with an embodiment of the invention;

Figure 2 is a flow diagram explaining in general the process involved in
finding and matching fingerprints in accordance with an embodiment of the invention; 

Figure 3 is a flow diagram illustrating in general the methodology for
determining an adaptive threshold in accordance with an embodiment of the present
invention; and

Figure 4 is a flow diagram illustrating a specific adaptive threshold setting 
methodology in accordance with embodiments of the invention. 

DESCRIPTION OF EMBODIMENTS 

Referring to Figure 1, there is shown a functional block diagram divided into a
client side 100 and a database server side 200. At the client side, an object is received by a
fingerprint extraction module 110 and a query fingerprint F computed for the object. The
query fingerprint F is, on the one hand, passed to a statistical module 120 and, on the other
hand, also passed to the database server side 200. The statistical module 120 determines a
measure of randomness/correlation (for instance, it may determine the internal correlation) of
the query fingerprint F and passes this information to a threshold determiner 130. The
threshold determiner 130, on the basis of the information from the module 120, adaptively
sets a threshold level T and passes this threshold level T to the database server side 200.

At the database server side 200, a matching module 210 receives the query
fingerprint F from the client side 100 and looks for the best match of that fingerprint within a
database of known fingerprints. The best match information is then passed to a threshold
comparison module 220 to determine whether a best matching candidate fingerprint is close
enough (within threshold distance T) to the query fingerprint to determine the identity of the
input object with the matched object corresponding to the candidate fingerprint. In the case
where the fingerprint F takes binary values, the threshold comparison module 220 might, for
instance, compare the Hamming distance between a fingerprint block H1 and a fingerprint
block H2 relating to the best match in the database 210 and check to see whether the
Hamming distance between the two blocks is below the threshold distance T, supplied to the
comparison module 220 from the threshold determining module 130. An identification
decision is made by identification module 230 so that if the Hamming distance between the
two derived fingerprint blocks is below the threshold distance T then the unidentified query
object is declared similar to the object found in the database and the relevant metadata is
returned.

In the above description the query fingerprint F and the threshold T are sent by
the client side 100 to the database server side 200. It should be noted, however, that the
threshold T could also be determined at the database server side 200, so that
modifications of the aforementioned block diagram are possible.

Referring now to Figure 2, there is shown a flow diagram which explains, in
general, the operation of the components of the block diagram of Figure 1 in finding and
matching fingerprints.

In a step S100, an object sample (e.g. in the case of video a short "clip") is
received and a query fingerprint determined based upon the sample. This query fingerprint
may be determined in accordance with any suitable prior art method (such as disclosed in
US 2002/0178410 A1). In a step S200 (reached by pathway "A"), a threshold for the query
fingerprint is determined in accordance with the particular characteristics
(randomness/correlation) of the query fingerprint.

In a step S300, which may be carried out in parallel with step S200, the query
fingerprint is matched to fingerprints held on the database server side 200, to return a best
matching candidate. Again, this matching process may be performed conventionally, so as to
return the closest match to the query fingerprint.

In the step S300, the "distance" between the query fingerprint and the best
match candidate will be determined and, in a step S400, it is checked whether or not the
"distance" is less than the threshold distance determined in step S200. If the distance between
the query fingerprint and the best match candidate is found in step S400 to be greater than the
threshold, then in step S500 the result is returned that no matching object to the query object
has been found. On the other hand, if the distance between query fingerprint and best match
candidate fingerprint is less than the threshold distance in step S400, then in step S600 a
match is declared between the query object and the object in the database relating to the best
matching candidate. Metadata etc., of the best matching object may then be returned to a
user.

In Figure 2, the pathway "A" denoted by the broken lines leading to step S200
from S100 denotes one option for setting a threshold T=T1 based on the query fingerprint.
Alternatively however, pathway "A" may be disregarded and a threshold T=T2 may be based
upon the characteristics of the best matching candidate. This possibility is denoted by the
alternative pathway "B" from S300 to S200.

In a further alternative, the threshold T may be set based upon a combination
of the characteristics of both the query fingerprint and the best matching candidate fingerprint,
e.g. by setting a threshold at the average between two derived adaptive thresholds T1, T2.
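
The flow of Figure 2, including the three options for the threshold, might be sketched as below. This is an illustrative outline only: fingerprints are assumed to be equal-length byte strings compared by Hamming distance, the database is an in-memory sequence of (fingerprint, metadata) pairs, and `threshold_for` is a hypothetical callable standing in for step S200; none of these names come from the specification.

```python
from typing import Callable, Optional, Sequence, Tuple


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length fingerprints."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def identify(query: bytes,
             database: Sequence[Tuple[bytes, str]],
             threshold_for: Callable[[bytes], float],
             mode: str = "query") -> Optional[str]:
    """Return the metadata of the matched object, or None if no match is found."""
    if not database:
        return None
    # S300: find the best matching candidate and its distance to the query.
    best_fp, best_meta = min(database,
                             key=lambda entry: hamming_distance(query, entry[0]))
    distance = hamming_distance(query, best_fp)
    # S200: derive the adaptive threshold.
    if mode == "query":          # pathway "A": T = T1
        threshold = threshold_for(query)
    elif mode == "candidate":    # pathway "B": T = T2
        threshold = threshold_for(best_fp)
    elif mode == "average":      # combination: average of T1 and T2
        threshold = (threshold_for(query) + threshold_for(best_fp)) / 2
    else:
        raise ValueError("mode must be 'query', 'candidate' or 'average'")
    # S400: compare; S600 on success, S500 (no match) otherwise.
    return best_meta if distance < threshold else None
```

Passing the threshold computation in as a callable keeps the matching flow independent of the statistical model used to set T.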

Figure 3 is a flow diagram illustrating the general methodology for adaptively
determining a given threshold T.

In step S210, the query or candidate fingerprint is received and a measure of
randomness of the fingerprint determined; then, in step S220, a threshold distance is set
according to the measure of randomness found in step S210.

As will be appreciated from the above and from the explanation in relation to
Figure 1, the threshold value T (T1 or T2) used in the comparison is adapted to the
randomness/correlation in the query fingerprint and/or the best matching candidate.
More specifically, in the case of threshold determination for a query fingerprint, the
correlation of the query fingerprint is determined and, from this correlation, the threshold to
be used during matching is computed. The less random the fingerprint is found to be (i.e. the
higher its internal correlation), the smaller the threshold distance T can be set without
adversely affecting the FRR.

As stated, the threshold is determined upon the internal correlation of the
query fingerprint, a best matching candidate fingerprint or a combination of the two. In cases
where the fingerprint is binary and the fingerprint-bits behave like a Markov-process, a
solution can be derived for adaptively setting the threshold.

The solution to the adaptive threshold setting problem is shown in Figure 4. In
a step S221, the internal correlation of the fingerprint in question is determined, in step S222
the transition probability for the fingerprint is determined based upon the internal correlation
and in step S223, the threshold distance is set adaptively, based upon both the transition
probability (explained below) and a desired false acceptance rate.

Let the fingerprint consist of M bits per frame and span K frames. In this case,
the fingerprint can be denoted F(m,k), where k is the frame index (ranging from 0 to K-1)
and m is the bit-index within a frame (ranging from 0 to M-1). Let q denote the probability
that a fingerprint bit extracted from frame k is unequal to the corresponding fingerprint bit
from frame k-1, i.e. q = Prob[bit(m,k) ≠ bit(m,k-1)]. This probability q is called the
transition probability. In this case the correlation increases (compared to the case of purely
random bits, in which q=1/2) by a factor

$$\sqrt{\frac{1+(1-2q)^{2}}{1-(1-2q)^{2}}} \qquad (3)$$

As a consequence, the False Acceptance Rate FAR is described by the relation

$$\mathrm{FAR} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{1-2T}{\sqrt{2}}\,\sqrt{n}\,\sqrt{\frac{1-(1-2q)^{2}}{1+(1-2q)^{2}}}\right) \qquad (4)$$

where n = MK is the number of fingerprint bits and the threshold T is expressed as a fraction of n.

Use of the above relation for computing an adaptive threshold from the desired
FAR and the computed transition probability q may be summarised as follows:
Extract fingerprint F.

Determine the transition probability q for fingerprint F, as follows:

(a) Determine how many of the fingerprint bits F(m,k) are different from their
predecessor F(m,k-1).

(b) Divide the number of transitions, as computed in step (a), by the theoretical
maximum M*(K-1), which would be obtained if, for each frame, all fingerprint bits were
the opposite of the bits in the previous frame, to determine the transition probability q =
(number of bit transitions)/(M*(K-1)).

Determine the threshold T which is to be used for matching this specific query
fingerprint F from the computed value q and a pre-agreed False Acceptance Rate,
using relation (4).
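
The steps above can be sketched in code. The sketch rests on stated assumptions rather than on a definitive reading of the specification: relation (4) is taken as FAR = 1/2 erfc((1 - 2T)√n / (√2 · c)), with c = √((1 + (1-2q)²)/(1 - (1-2q)²)), n = M*K, and T expressed as a fraction of the n bits. The fingerprint is assumed to be a list of K frames of M bits each, and all function names are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence


def transition_probability(frames: Sequence[Sequence[int]]) -> float:
    """Steps (a) and (b): frames[k][m] holds the bit F(m,k)."""
    K, M = len(frames), len(frames[0])
    if K < 2:
        raise ValueError("at least two frames are needed")
    transitions = sum(frames[k][m] != frames[k - 1][m]
                      for k in range(1, K) for m in range(M))
    return transitions / (M * (K - 1))


def correlation_factor(q: float) -> float:
    """Assumed widening factor c; equals 1 for purely random bits (q = 1/2)."""
    a2 = (1.0 - 2.0 * q) ** 2
    if a2 >= 1.0:
        # q = 0 or q = 1 (e.g. a perfectly still video) makes the factor unbounded.
        raise ValueError("q must lie strictly between 0 and 1")
    return math.sqrt((1.0 + a2) / (1.0 - a2))


def far_markov(T: float, n: int, q: float) -> float:
    """FAR for a relative threshold T under the assumed form of relation (4)."""
    argument = (1.0 - 2.0 * T) * math.sqrt(n) / (math.sqrt(2.0) * correlation_factor(q))
    return 0.5 * math.erfc(argument)


def adaptive_threshold(q: float, n: int, target_far: float) -> float:
    """Relative threshold T at which far_markov equals target_far."""
    # 1/2 erfc(x / sqrt(2)) = Phi(-x), hence x = -Phi^{-1}(FAR).
    x = -NormalDist().inv_cdf(target_far)
    return 0.5 * (1.0 - correlation_factor(q) * x / math.sqrt(n))
```

Under these assumptions, multiplying the returned value by n gives the limit on the Hamming distance, and a more strongly correlated fingerprint (q further from 1/2) receives a smaller threshold for the same target FAR.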

From the above, the threshold T may be adaptively set as T=T1 (based on the
correlation of the query fingerprint), or T=T2 (based on the correlation of the best match
fingerprint), or T=T3 (based on a combination of T1 and T2, e.g. T3 = (T1 + T2)/2). Then, in
the decision stage, if the Hamming distance is less than T, declare the underlying objects to be
the same.

In the above specific examples of the present invention the threshold distance
is set adaptively based on the internal characteristics of a particular query sample or, indeed,
of a particular candidate sample or set of samples. However, whilst the specific examples
described take the internal characteristics in question to be randomness/correlation, it will be
realised that other types of statistical distribution might apply to certain types of information
signal and that, therefore, the invention may be legitimately extended to providing adaptive
thresholds according to any given applicable "statistical model" to which a query sample or a
candidate sample fingerprint is expected to conform.

Further, the skilled man will realise that whilst the flow diagrams of Figures 2 through 4
show one arrangement for implementing the invention, other arrangements are
possible. For instance, rather than returning a single best match candidate in step S300 of
Figure 2, a plurality of close matching candidates within a threshold distance may be returned
and processed in parallel (or less advantageously in series) to thereafter calculate the "best"
match. The invention can also be applied using so-called "pruning" techniques in which
certain candidates within the database can be immediately discarded if it is obvious that they
can never make a match; searching/matching can then be done within a much reduced
search space.

In accordance with embodiments of the invention, methods and apparatus for
setting an adaptive threshold are disclosed, in which the threshold depends upon specific
characteristics of a fingerprint. The particular method is very suitable for use in matching of
video content, but is not limited to this. The techniques described may be applied to various
different areas of technology and various different signal types, including, but not limited to,
audio signals, video signals, multimedia signals.

The skilled man will realise that the processes described may be implemented 
in software, hardware, or any suitable combination. 

In summary, the invention relates to methods and apparatus for fingerprint
matching. In an embodiment of the invention, apparatus comprising a fingerprint extraction
module (110), a fingerprint matching module (210), a statistical module (120) and an
identification module (230) is provided. The fingerprint extraction module (110) receives an
information signal forming part of a query object and constructs a query fingerprint. The
fingerprint matching module (210) compares the query fingerprint to candidates stored in a
database (215) to find at least one potentially best matching candidate. Meanwhile, the
statistical module (120) determines a statistical model of the query fingerprint so as to, for instance,
determine the statistical distribution of the query fingerprint. The threshold determiner (130)
is arranged, on the basis of the distribution of the query fingerprint, to derive an adaptive
threshold distance T within which the query fingerprint and a potentially best matching
candidate may be declared similar by the identification module (230). By setting a threshold
in an adaptive manner according to the statistical distribution of the query fingerprint, an
improved false acceptance rate (FAR) and other advantages may be achieved.



